1 Introduction



Many socio-economic and biological processes can be modeled as systems of interacting
individuals; see for example papers in the econophysics bulletin [1]. One may then try to derive their
global behavior from individual interactions between their basic entities. Such an approach is
fundamental in statistical physics, which deals with systems of many interacting particles. We will
explore similarities and differences between systems of many interacting players maximizing
their individual payoffs and particles minimizing their interaction energy.

Here we will consider game-theoretic models of many interacting agents [2, 3, 4]. In such 
models, agents have at their disposal certain strategies and their payoffs in a game depend 
on strategies chosen both by them and by their opponents. A configuration of a system, 
that is an assignment of strategies to agents, is a Nash equilibrium if for any agent, for fixed 
strategies of his opponents, changing the current strategy will not increase his payoff. One of the 
fundamental problems in game theory is the equilibrium selection in games with multiple Nash 
equilibria. In two-player games with two strategies we may have two Nash equilibria: a
payoff-dominant (also called efficient) one and a risk-dominant one. In the efficient equilibrium,
players receive the highest possible payoffs. A strategy is risk-dominant if it has a higher expected
payoff against a player playing both strategies with equal probabilities. It is played by individuals
averse to risk.

One of the selection methods is to construct a dynamical system where in the long run only 
one equilibrium is played with a high frequency. Here we will discuss adaptive dynamics where 
agents adapt in some optimal way to the environment created by other players and with a small 
probability, representing the noise of the system, they make mistakes. To describe the long-run 
behavior of such stochastic dynamics, Foster and Young [5] introduced a concept of stochastic 
stability. A configuration of a system is stochastically stable if it has a positive probability 
in the stationary state of the above dynamics in the limit of zero noise. It means that in the 
long run we observe it with a positive frequency. 

The main goal of this paper is to review results concerning the dependence of the long-run 
behavior of the system on the number of players and the noise level. We will show that in
many games, when the number of players or the noise level increases, the population undergoes 
a transition between its equilibria. 

In spatial games, players are located on vertices of certain graphs and they interact only 
with their neighbors [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]. In discrete moments of time, players
adapt to their neighbors by choosing with a high probability the strategy which is the best
response, i.e. the one which maximizes the sum of the payoffs obtained in individual games 
and with a small probability they make mistakes. Now, for any arbitrarily low but fixed noise, 
if the number of players is big enough, then the probability of any individual configuration is 
practically zero. It means that for a large number of players, to observe a stochastically stable 
configuration we must assume that players make mistakes with extremely small probabilities. 
On the other hand, it may happen that in the long run, for a low but fixed noise and sufficiently 
big number of players, the stationary state is highly concentrated on an ensemble consisting of 
one Nash configuration and its small perturbations, i.e. configurations where most players play 
the same strategy. We will call such configurations ensemble stable. 

We will consider here games with symmetric Nash equilibria and homogeneous Nash config- 
urations. By the stochastic stability of a strategy or a Nash equilibrium we mean the stochastic 
stability of the corresponding Nash configuration. We will present examples of spatial games 
with three strategies where concepts of stochastic stability and ensemble stability do not co- 
incide [16]. In particular, we may have the situation where a stochastically stable strategy is 
played in the long run with an arbitrarily low frequency. We discuss also an effect of adding a 
dominated strategy to a game with two strategies. In particular, the presence of such a strategy 
may cause a risk and payoff-dominant strategy to be observed in the long run with a frequency 
close to zero. In the above models, when the number of players or the noise level increases, the
population undergoes a transition between its equilibria.

We will also review two models of adaptive dynamics of a Darwinian type where states
of a population are characterized by a number of individuals playing the first strategy. In 
both models, the selection part of the dynamics ensures that if the mean payoff of a given 
strategy at the time t is bigger than the mean payoff of the other one, then the number of 
individuals playing the given strategy should increase in t + 1. In the first model, introduced 
by Kandori, Mailath and Rob [17], one assumes (as in the standard replicator dynamics) that 
individuals receive average payoffs with respect to all possible opponents - they play against the 
average strategy. In the second model, introduced by Robson and Vega-Redondo [18, 19], at 
any moment of time, individuals play only one game with randomly chosen opponents. In both 
models, players may mutate with a small probability hence the population may move against 
a selection pressure. 

It was shown that in the Kandori-Mailath-Rob model, the risk-dominant strategy is stochastically
stable: if the mutation level is small enough, we observe it in the long run with a
frequency close to one [17]. In the model of Robson and Vega-Redondo, the payoff-dominant
strategy is stochastically stable. It is one of very few models in which a payoff-dominant strategy
is stochastically stable in the presence of a risk-dominant one. We showed in [20] that in
the sequential dynamics in the model of Robson and Vega-Redondo, for any arbitrarily low but
fixed level of mutations, if the number of players is sufficiently big, a risk-dominant strategy
is played in the long run with a frequency close to one. More precisely, we proved that an ensemble
consisting of states of the population in which only a small fraction of players play the efficient
strategy has a probability close to one in the stationary state. Again, in the limit of an
infinite number of players, one has to consider the ensemble stability rather than the stability
of individual configurations.

We showed that the stochastically stable efficient strategy is then observed with a very low frequency.
It means that when the number of players increases, the population undergoes a transition from
the efficient equilibrium to the risk-dominant one.

In Section 2, we will introduce basic notions of game theory. In Section 3, spatial games 
are discussed. In Section 4, we compare ground-state configurations in lattice-gas models and 
Nash configurations in spatial games. In Section 5, we introduce a concept of the ensemble 
stability and present an example in which a stochastically stable strategy is played in the long 
run with a frequency close to zero. In Section 6, we review results concerning stochastic and 
ensemble stability in adaptive dynamics. 

2 Nash equilibria 

To characterize a game-theoretic model one has to specify the set of players, strategies they have 
at their disposal and payoffs they receive. Although in many models the number of players 
is very large, their strategic interactions are usually decomposed into a sum of two-player 
games. Only recently there have appeared some systematic studies of truly multi-player games 
[21, 22, 23, 24, 25]. Here we will consider only two-player games with two or three strategies. 
The payoff of any player depends not only on his strategy but also on strategies played by his 
opponents. It can be represented by a k x k matrix, where k is the number of strategies. Let us
begin by describing a game with two strategies and two symmetric Nash equilibria. A generic
payoff matrix is given by

Example 1

$$U = \begin{array}{c|cc} & A & B \\ \hline A & a & b \\ B & c & d \end{array}$$

where the ij entry, i, j = A, B, is the payoff of the first (row) player when he plays the
strategy i and the second (column) player plays the strategy j. We assume that both players
are the same and hence payoffs of the column player are given by the matrix transposed to U;
such games are called symmetric.

An assignment of strategies to both players is a Nash equilibrium, if for each player, for a 
fixed strategy of his opponent, changing the current strategy will not increase his payoff. If 
a > c and d > b, then (A, A) and (B, B) are two Nash equilibria. If a + b < c + d, then the 
strategy B has a higher expected payoff against a player playing both strategies with equal 
probabilities. We say that B risk-dominates the strategy A (the notion of risk-dominance
was introduced and thoroughly studied by Harsanyi and Selten [26]). If at the same time a > d,
then we have a selection problem of choosing between the payoff-dominant (also called efficient)
equilibrium (A, A) and the risk-dominant equilibrium (B, B).

Games with symmetric payoff matrices are called doubly symmetric or potential games [27]. 
More generally, a game is called a potential game if its payoff matrix can be changed to a 
symmetric one by adding payoffs to its columns. Such a payoff transformation does not change
the strategic character of the game, in particular it does not change the set of its equilibria. More
formally, it means that there exists a symmetric matrix V called a potential of the game such 
that for any three strategies A, B, and C, 

U(A, C) - U(B, C) = V(A, C) - V(B, C). (1) 

It is easy to see that every game with two strategies has a potential V with V(A, A) = a - c,
V(B, B) = d - b, and V(A, B) = V(B, A) = 0. It follows that in two-player games with two
strategies an equilibrium is risk-dominant if and only if it has a bigger potential.
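
The conditions of this section can be checked mechanically. The following Python sketch is an illustration only: the function name `analyze_2x2` and the numerical payoffs are our own assumptions, not part of the cited literature. It tests the Nash conditions, payoff dominance, risk dominance, and the identity (1) for the two-strategy potential V.

```python
def analyze_2x2(a, b, c, d):
    """Classify the symmetric two-strategy game with rows A = (a, b), B = (c, d)."""
    S = ("A", "B")
    U = {("A", "A"): a, ("A", "B"): b, ("B", "A"): c, ("B", "B"): d}
    # Potential of a two-strategy game: zero off the diagonal.
    V = {("A", "A"): a - c, ("A", "B"): 0, ("B", "A"): 0, ("B", "B"): d - b}

    # (s, s) is a symmetric Nash equilibrium if no deviation q increases the payoff.
    nash = [(s, s) for s in S if all(U[s, s] >= U[q, s] for q in S)]

    # Identity (1): U(s, r) - U(q, r) = V(s, r) - V(q, r) for all strategies s, q, r.
    potential_ok = all(
        U[s, r] - U[q, r] == V[s, r] - V[q, r] for s in S for q in S for r in S
    )

    # Risk dominance compares expected payoffs against a 50/50 opponent (None on a tie).
    if a + b == c + d:
        risk_dominant = None
    else:
        risk_dominant = "B" if a + b < c + d else "A"

    payoff_dominant = None if a == d else ("A" if a > d else "B")
    return {
        "nash": nash,
        "risk_dominant": risk_dominant,
        "payoff_dominant": payoff_dominant,
        "potential": V,
        "potential_ok": potential_ok,
    }


# Hypothetical payoffs with a > c, d > b, a + b < c + d and a > d.
result = analyze_2x2(a=5, b=0, c=4, d=3)
print(result)
```

For these payoffs both (A, A) and (B, B) are reported as Nash equilibria, A is payoff-dominant, B is risk-dominant, and V(B, B) = 3 exceeds V(A, A) = 1, in line with the statement that the risk-dominant equilibrium has the bigger potential.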

3 Spatial games with local interactions 

Let $\Lambda$ be a finite subset of the simple lattice $Z^2$ (we may think about a square centered at
the origin of the lattice). Every site of $\Lambda$ is occupied by one player who has at his disposal
one of k different pure strategies. Let S be the set of pure strategies; then $\Omega_\Lambda = S^\Lambda$ is the set
of all possible configurations of players, that is all possible assignments of pure strategies to
individual players. For every $i \in \Lambda$, $X_i$ is the strategy of the i-th player in the configuration
$X \in \Omega_\Lambda$ and $X_{-i}$ denotes strategies of all remaining players; X therefore can be represented as
the pair $(X_i, X_{-i})$. $U : S \times S \to R$ is a matrix of payoffs of our stage game. U(A, B), $A, B \in S$,
is the payoff of the first (row) player playing the strategy A when the second one (a column
player) is playing B. We will consider here only symmetric games so the payoff of the second
player is given by U(B, A) (the payoff matrix of the second player is the transpose of the payoff
matrix U of the first one). Every player interacts only with his neighbors and his payoff is
the sum of the payoffs resulting from individual games. We assume that he has to use the
same strategy for all neighbors. Let $N_i$ denote the neighborhood of the i-th player. For the
nearest-neighbor interaction we have $N_i = \{j; |j - i| = 1\}$, where $|i - j|$ is the distance between
i and j. For $X \in \Omega_\Lambda$ we denote by $\nu_i(X)$ the payoff of the i-th player in the configuration X:

$$\nu_i(X) = \sum_{j \in N_i} U(X_i, X_j). \tag{2}$$

Definition 1 $X \in \Omega_\Lambda$ is a Nash configuration if for every $i \in \Lambda$ and $A \in S$,
$\nu_i(X_i, X_{-i}) \geq \nu_i(A, X_{-i})$.

In this paper we will discuss only coordination games, where there are k pure symmetric
Nash equilibria and therefore k homogeneous Nash configurations, where all players play the
same strategy.

We describe now the deterministic dynamics of the best-response rule. Namely, at each
discrete moment of time t = 1, 2, ..., a randomly chosen player may update his strategy. He
simply adopts the strategy $X_i^t$ which gives him the maximal total payoff $\nu_i(X_i^t, X_{-i}^{t-1})$ for given
$X_{-i}^{t-1}$, a configuration of strategies of the remaining players at time t - 1.

Now we allow players to make mistakes with a small probability, that is to say they may
not choose the best response. The probability of making a mistake may depend on the state
of the system (a configuration of strategies of neighboring players). We will assume that this
probability is a decreasing function of the payoff lost as a result of a mistake [6]. In the
log-linear rule, the probability that the i-th player chooses the strategy $X_i^t$ at time t is given
by the following conditional probability:

$$p_i^T(X_i^t \mid X_{-i}^{t-1}) = \frac{e^{(1/T)\nu_i(X_i^t, X_{-i}^{t-1})}}{\sum_{A \in S} e^{(1/T)\nu_i(A, X_{-i}^{t-1})}}, \tag{3}$$

where T > 0 measures the noise level.

Let us observe that if $T \to 0$, $p_i^T$ converges to the best-response rule. Our stochastic
dynamics is an example of an ergodic Markov chain with $|S^\Lambda|$ states. Therefore, it has a unique
stationary state which we denote by $\mu_\Lambda^T$.

The following definition was first introduced by Foster and Young [5]: 

Definition 2 $X \in \Omega_\Lambda$ is stochastically stable if $\lim_{T \to 0} \mu_\Lambda^T(X) > 0$.

If X is stochastically stable, then the frequency of visiting X converges to a positive number 
along any time trajectory almost surely. It means that in the long run we observe X with a 
positive frequency. In most models it is usually equal to 1. 
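
The log-linear rule (3) is a softmax of local payoffs with temperature T. A minimal Python sketch follows; the helper names `local_payoffs` and `log_linear_probs`, the payoff values, and the neighborhood are our own illustrative assumptions. It shows how the choice probabilities concentrate on the best response as the noise T decreases.

```python
import math


def local_payoffs(U, neighbor_strategies):
    """Payoff of each candidate strategy s: the sum of U[s][r] over the neighbors' strategies r."""
    return {s: sum(U[s][r] for r in neighbor_strategies) for s in U}


def log_linear_probs(payoffs, T):
    """Choice probabilities proportional to exp(payoff / T), as in the log-linear rule."""
    top = max(payoffs.values())  # subtract the maximum for numerical stability
    weights = {s: math.exp((v - top) / T) for s, v in payoffs.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


# Hypothetical two-strategy payoffs (rows: own strategy, columns: neighbor's strategy).
U = {"A": {"A": 5, "B": 0}, "B": {"A": 4, "B": 3}}

# A player with four neighbors, two playing A and two playing B.
payoffs = local_payoffs(U, ["A", "A", "B", "B"])

for T in (10.0, 1.0, 0.1):
    print(T, log_linear_probs(payoffs, T))
```

In this neighborhood B is the best response (total payoff 14 against 10), so the probability of choosing B rises from about 0.6 at T = 10 towards 1 as T decreases.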

In examples below, we consider symmetric games with symmetric Nash equilibria and ho- 
mogeneous Nash configurations. By the stochastic stability of a strategy or a Nash equilibrium 
we mean the stochastic stability of the corresponding Nash configuration. 

4 Ground states and Nash configurations 

We will present here one of the basic models of interacting particles. In classical lattice-gas 
models, particles occupy lattice sites and interact only with their neighbors. The fundamental 
concept is that of a ground-state configuration. It can be formulated conveniently in the limit 
of an infinite lattice (the infinite number of particles). Let us assume that every site of the $Z^2$
lattice can be occupied by one of k different particles. An infinite-lattice configuration is an
assignment of particles to lattice sites, i.e., an element of $\Omega = \{1, \ldots, k\}^{Z^2}$. If $X \in \Omega$ and $i \in Z^2$,
then we denote by $X_i$ the restriction of X to i. We will assume here that only nearest-neighbor
particles interact. The energy of their interaction is given by a symmetric k x k matrix V. An
element V(A, B) is the interaction energy of two nearest-neighbor particles of type A and B.
The total energy of a system in a configuration X in a finite region $\Lambda$ can be then written as

$$H_\Lambda(X) = \sum_{(i,j) \in \Lambda} V(X_i, X_j).$$

Y is a local excitation of X, $Y \sim X$, $Y, X \in \Omega$, if there exists a finite $\Lambda \subset Z^2$ such that
X = Y outside $\Lambda$.

For $Y \sim X$, the relative energy is defined by

$$H(Y, X) = \sum_{(i,j)} \left( V(Y_i, Y_j) - V(X_i, X_j) \right), \tag{4}$$

where the summation is with respect to pairs of nearest neighbors on $Z^2$. Observe that this is
a finite sum; the energy difference between Y and X is equal to 0 outside some finite $\Lambda$.

Definition 3 $X \in \Omega$ is a ground-state configuration of V if

$$H(Y, X) \geq 0 \text{ for any } Y \sim X.$$

That is, we cannot lower the energy of a ground-state configuration by changing it locally. 
The energy density e(X) of a configuration X is 

$$e(X) = \liminf_{\Lambda \to Z^2} \frac{H_\Lambda(X)}{|\Lambda|}, \tag{5}$$

where $|\Lambda|$ is the number of lattice sites in $\Lambda$. It can be shown that any ground-state configuration
has the minimal energy density [30]. It means that local conditions present in the definition of
a ground-state configuration force global minimization of the energy density.

As we see, the concept of a ground-state configuration is very similar to that of a Nash 
configuration. We have to identify particles with agents, types of particles with strategies and 
instead of minimizing interaction energies we should maximize payoffs. There are however profound
differences. First of all, ground-state configurations can be defined only for symmetric
matrices; an interaction energy is assigned to a pair of particles, whereas payoffs are assigned to
individual players and may be different for each of them. Ground-state configurations are stable
with respect to all local changes; Nash configurations are stable only with respect to one-player
changes. It means that for the same symmetric matrix U, there may exist a configuration which
is a Nash configuration but not a ground-state configuration for the interaction matrix -U. The
simplest example is given by the following matrix:

Example 2

$$U = \begin{array}{c|cc} & A & B \\ \hline A & 2 & 0 \\ B & 0 & 1 \end{array}$$

(A, A) and (B, B) are Nash configurations for a system consisting of two players but only
(A, A) is a ground-state configuration for V = -U. We may therefore consider the concept of
a ground-state configuration as a refinement of a Nash equilibrium.
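
The difference between the two notions in Example 2 can be verified by enumeration. The Python sketch below is ours and assumes the payoff matrix with entries U(A, A) = 2, U(B, B) = 1 and zero off the diagonal; the function names are hypothetical. For two players it lists the configurations stable against one-player changes (Nash) and those minimizing the energy V = -U over all changes (ground states).

```python
from itertools import product

# Payoff matrix of Example 2 (off-diagonal entries assumed to be zero).
U = {"A": {"A": 2, "B": 0}, "B": {"A": 0, "B": 1}}
S = list(U)


def is_nash(x1, x2):
    """Neither of the two players can increase his payoff by a unilateral change."""
    first_ok = all(U[x1][x2] >= U[s][x2] for s in S)
    second_ok = all(U[x2][x1] >= U[s][x1] for s in S)
    return first_ok and second_ok


def energy(x1, x2):
    """Interaction energy of the pair for V = -U."""
    return -U[x1][x2]


configs = list(product(S, repeat=2))
nash = [x for x in configs if is_nash(*x)]

# In a finite system every change is local, so ground states are the energy minimizers.
lowest = min(energy(*x) for x in configs)
ground = [x for x in configs if energy(*x) == lowest]

print("Nash configurations:  ", nash)
print("ground configurations:", ground)
```

The enumeration returns (A, A) and (B, B) as Nash configurations but only (A, A) as a ground state: leaving (B, B) requires both players to change at once, which is allowed for particles but not for players.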

For any classical lattice-gas model there exists at least one ground-state configuration. This 
can be seen in the following way. We start with an arbitrary configuration. If it cannot be 
changed locally to decrease its energy it is already a ground-state configuration. Otherwise we 
may change it locally and decrease the energy of the system. If our system is finite, then after
a finite number of steps we arrive at a ground-state configuration; at every step we decrease
the energy of the system and for every finite system its possible energies form a finite set. For
an infinite system, we have to proceed ad infinitum converging to a ground-state configuration
(this follows from the compactness of $\Omega$). Game models are different. It may happen that a
game with a nonsymmetric payoff matrix does not possess a Nash configuration. The classical
example is that of the Rock-Scissors-Paper game given by the following matrix.

Example 3

$$U = \begin{array}{c|ccc} & R & S & P \\ \hline R & 1 & 2 & 0 \\ S & 0 & 1 & 2 \\ P & 2 & 0 & 1 \end{array}$$

One may show that this game does not have any Nash configurations on $Z$ and $Z^2$ but has many
Nash configurations on the triangular lattice.
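
The absence of Nash configurations can be illustrated on small systems. The following Python sketch is ours and makes two assumptions: the payoff matrix of Example 3 has zeros in the entries not printed above, and a ring of n players with periodic boundary conditions serves as a finite stand-in for the lattice $Z$. It enumerates all configurations and keeps those in which every player plays a best response to his two neighbors.

```python
from itertools import product

# Rock-Scissors-Paper payoffs: 2 for a win, 1 for a tie, 0 for a loss (assumed).
U = {
    "R": {"R": 1, "S": 2, "P": 0},
    "S": {"R": 0, "S": 1, "P": 2},
    "P": {"R": 2, "S": 0, "P": 1},
}
S = list(U)


def payoff(s, left, right):
    """Payoff of strategy s against the two nearest neighbors on the ring."""
    return U[s][left] + U[s][right]


def nash_configurations(n):
    """All configurations on a ring of n players that are stable to one-player changes."""
    found = []
    for x in product(S, repeat=n):
        stable = True
        for i in range(n):
            left, right = x[i - 1], x[(i + 1) % n]
            best = max(payoff(s, left, right) for s in S)
            if payoff(x[i], left, right) < best:
                stable = False
                break
        if stable:
            found.append(x)
    return found


for n in range(3, 8):
    print(n, len(nash_configurations(n)))
```

On every ring tried the list is empty. The reason is visible in the best responses: a player using P needs a neighbor using R, whereas a player using R tolerates only neighbors using S or R.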

In short, ground-state configurations minimize the total energy of a particle system, Nash 
configurations do not necessarily maximize the total payoff of a society. 

A ground-state configuration is an equilibrium concept for systems of interacting particles at
zero temperature. For positive temperatures, we must take into account fluctuations caused
by thermal motions of particles. Equilibrium behavior of the system results then from the
competition between its energy V and entropy S (which measures the number of configurations
corresponding to a macroscopic state), i.e. the minimization of its free energy F = V - TS,
where T is the temperature of the system, a measure of thermal motions. At zero
temperature, T = 0, the minimization of the free energy reduces to the minimization of the
energy. This zero-temperature limit looks very similar to the zero-noise limit present in the
definition of the stochastic stability. Equilibrium behavior of a system of interacting particles
can be described by specifying probabilities of occurrence for all particle configurations. More
formally, it is described by a Gibbs state (see [31] and references therein).

We construct it in the following way. Let $\Lambda$ be a finite subset of $Z^2$ and $\rho_\Lambda^T$ the following
probability mass function on $\Omega_\Lambda = \{1, \ldots, k\}^\Lambda$:

$$\rho_\Lambda^T(X) = \frac{1}{Z_\Lambda^T} \exp(-H_\Lambda(X)/T) \tag{6}$$

for every $X \in \Omega_\Lambda$, where

$$Z_\Lambda^T = \sum_{X \in \Omega_\Lambda} \exp(-H_\Lambda(X)/T) \tag{7}$$

is a normalizing factor and T is the temperature of the system.

We define a Gibbs state as a limit of $\rho_\Lambda^T$ as $\Lambda \to Z^2$. One can prove that a limit of a
translation-invariant Gibbs state for a given interaction as $T \to 0$ is a measure supported by
ground-state configurations. One of the fundamental problems of statistical mechanics is a
characterization of low-temperature Gibbs states for given interactions between particles.

Let us now come back to spatial games with players located on a finite region $\Lambda$ of the $Z^2$
lattice and receiving a payoff given by the matrix in Example 1. It has two homogeneous Nash
configurations, $X^A$ and $X^B$, in which all players play the same strategy A or B respectively. If
V is a potential of the stage game, then $V(X) = \sum_{(i,j) \in \Lambda} V(X_i, X_j)$ is a potential of a configuration X
in the corresponding spatial game. One can show [8] that

$$\mu_\Lambda^T(X) = \frac{e^{(1/T) \sum_{(i,j) \in \Lambda} V(X_i, X_j)}}{\sum_{Z \in \Omega_\Lambda} e^{(1/T) \sum_{(i,j) \in \Lambda} V(Z_i, Z_j)}} \tag{8}$$

is the unique stationary state of the log-linear dynamics.

We may now explicitly perform the limit $T \to 0$ and use (1) to obtain that the risk-dominant
configuration $X^B$ is stochastically stable.
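
The zero-noise limit can be followed numerically on a small system. The Python sketch below is an illustration under our own assumptions: a ring of n = 6 players replaces the region of $Z^2$, the payoffs a = 5, b = 0, c = 4, d = 3 are hypothetical, and the function name `gibbs_state` is ours. It evaluates a distribution proportional to exp(V(X)/T), with V(X) the sum of the two-strategy potential over nearest-neighbor pairs, and prints the weights of the two homogeneous configurations.

```python
import math
from itertools import product


def gibbs_state(V, n, T):
    """Distribution proportional to exp(V(X) / T) over all configurations on a ring of n players."""
    S = list(V)
    configs = list(product(S, repeat=n))
    # Potential of a configuration: sum of V over the n nearest-neighbor pairs of the ring.
    pot = {x: sum(V[x[i]][x[(i + 1) % n]] for i in range(n)) for x in configs}
    top = max(pot.values())  # subtract the maximum for numerical stability
    weights = {x: math.exp((p - top) / T) for x, p in pot.items()}
    Z = sum(weights.values())
    return {x: w / Z for x, w in weights.items()}


# Hypothetical payoffs; B is risk-dominant and A is payoff-dominant.
a, b, c, d = 5, 0, 4, 3
V = {"A": {"A": a - c, "B": 0}, "B": {"A": 0, "B": d - b}}
n = 6
all_A, all_B = ("A",) * n, ("B",) * n

for T in (10.0, 1.0, 0.1):
    mu = gibbs_state(V, n, T)
    print(T, mu[all_A], mu[all_B])
```

As T decreases, almost all of the probability moves to the homogeneous configuration of the risk-dominant strategy B, which has the largest potential.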

5 Ensemble stability in spatial games 

The concept of stochastic stability involves individual configurations of players. In the zero- 
noise limit, the stationary state is usually concentrated on one or at most few configurations. 
However, for a low but fixed noise and for a big number of players, the probability of any 
individual configuration of players is practically zero. The stationary state, however, may be 
highly concentrated on an ensemble consisting of one Nash configuration and its small pertur- 
bations, i.e., configurations, where most players play the same strategy. Such configurations 
have relatively high probability in the stationary state. We call such configurations ensemble 
stable [16]. 

Definition 4 $X \in \Omega_\Lambda$ is $\epsilon$-ensemble stable if $\mu_\Lambda^T(Y \in \Omega_\Lambda; Y_i \neq X_i) < \epsilon$ for any $i \in \Lambda$ if
$\Lambda \supset \Lambda(T)$ for some $\Lambda(T)$.

Definition 5 $X \in \Omega_\Lambda$ is low-noise ensemble stable if for every $\epsilon > 0$ there exists $T(\epsilon)$ such
that if $T < T(\epsilon)$, then X is $\epsilon$-ensemble stable.

If X is $\epsilon$-ensemble stable for some T > 0 and $\epsilon$ close to zero, then the ensemble consisting
of X and configurations which are different from X at at most a few sites has a probability
close to one in the stationary state. It does not follow, however, that X is necessarily low-noise
ensemble stable or stochastically stable, as the following example shows.

Players are located on a finite subset $\Lambda$ of $Z^2$ (with periodic boundary conditions) and
interact with their four nearest neighbors. They have at their disposal three strategies: A, B, 
and C. The payoffs are given by the following symmetric matrix: 

Example 4

$$U = \begin{array}{c|ccc} & A & B & C \\ \hline A & 0 & 0.1 & 1 \\ B & 0.1 & 2 + a & 1 \\ C & 1 & 1 & 2 \end{array}$$

where $a \geq 0$.

Our game has two Nash equilibria: (B, B) and (C, C), and the corresponding spatial game
has two homogeneous Nash configurations: $X^B$ and $X^C$. For a = 0, both $X^B$ and $X^C$ are
ground-state configurations for -U; for a > 0 only $X^B$ is a ground-state configuration.

Observe that the strategy A gives a player the lowest payoff regardless of the strategy chosen
by his opponent. Such a strategy is called dominated. It is easy to see that dominated strategies
cannot be present in any Nash equilibrium. Therefore such strategies should not be used by 
players and consequently we might think that their presence should not have any impact on 
the long-run behavior of the system. We will see that this is not true in Example 4. 

The unique stationary state of the log-linear dynamics (3) is given in (8) with V = U.
Let us start our discussion with a = 0. It follows from (8) that $\lim_{T \to 0} \mu_\Lambda^T(X^k) = 1/2$, k =
B, C, so B and C are stochastically stable. Let us investigate the long-run behavior of our
system for large $\Lambda$, that is for a big number of players. Observe that $\lim_{\Lambda \to Z^2} \mu_\Lambda^T(X) = 0$ for
every $X \in \Omega = S^{Z^2}$. Hence for large $\Lambda$ and T > 0 we may only observe, with reasonable
positive frequencies, ensembles of configurations and not particular configurations. We will
be interested in ensembles which consist of a Nash configuration and its small perturbations,
that is configurations where most players adopt the same strategy. We perform first the limit
$\Lambda \to Z^2$ and obtain an infinite-volume Gibbs state

$$\mu^T = \lim_{\Lambda \to Z^2} \mu_\Lambda^T. \tag{9}$$

We may then apply a technique developed by Bricmont and Slawny [32, 33]. They studied low- 
temperature stability of the so-called dominant ground-state configurations. It follows from 
their results that 



$$\mu^T(X_i = C) > 1 - \epsilon(T) \tag{10}$$

for any $i \in Z^2$ and $\epsilon(T) \to 0$ as $T \to 0$ [16].

The following theorem is a simple consequence of (10). 

Theorem 1 If a = 0, then $X^C$ is low-noise ensemble stable.

We see that for any low but fixed T, if the number of players is big enough, then in the long
run, almost all players use the strategy C. On the other hand, if for any fixed number of players,
T is lowered substantially, then B and C appear with frequencies close to 1/2.

Let us sketch briefly the reason for such behavior. While it is true that both $X^B$ and $X^C$
have the same potential, which is half of the payoff of the whole system (it plays the role
of the total energy of a system of interacting particles), the $X^C$ Nash configuration has more
lowest-cost excitations. Namely, one player can change his strategy and switch to either A or
B and the total payoff will decrease by 8 units. Players in the $X^B$ Nash configuration have
only one such possibility, that is to switch to C; switching to A decreases the total payoff by 15.2.
Now, the probability of the occurrence of any configuration in the Gibbs state (which is the
stationary state of our stochastic dynamics) depends on the total payoff in an exponential way.
One can prove that the probability of the ensemble consisting of the $X^C$ Nash configuration and
configurations which are different from it at a few sites only is much bigger than the probability
of the analogous $X^B$-ensemble. It follows from the fact that the $X^C$-ensemble has many more
configurations than the $X^B$-ensemble. On the other hand, configurations which are outside
the $X^B$ and $X^C$-ensembles appear with exponentially small probabilities. It means that for large
enough systems (and small but not extremely small T) we observe in the stationary state
the $X^C$ Nash configuration with perhaps a few different strategies. The above argument was
made into a rigorous proof for an infinite system of the closely related lattice-gas model (the
Blume-Capel model) of interacting particles by Bricmont and Slawny in [32].
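
The excitation costs quoted above (8 and 15.2 units) can be recomputed directly. The Python sketch below is ours and rests on stated assumptions: a 4 x 4 torus stands in for the finite region with periodic boundary conditions, the matrix is that of Example 4 with a = 0, and the function names are hypothetical. It compares the total payoff of a homogeneous configuration before and after a single player changes his strategy.

```python
def total_payoff(U, grid):
    """Sum of the payoffs of all players on an L x L torus with four nearest neighbors."""
    L = len(grid)
    total = 0.0
    for i in range(L):
        for j in range(L):
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                total += U[grid[i][j]][grid[(i + di) % L][(j + dj) % L]]
    return total


def excitation_cost(U, base, new, L=4):
    """Decrease of the total payoff when one player in the all-`base` configuration plays `new`."""
    grid = [[base] * L for _ in range(L)]
    before = total_payoff(U, grid)
    grid[0][0] = new
    return before - total_payoff(U, grid)


a = 0.0  # the parameter of Example 4
U = {
    "A": {"A": 0.0, "B": 0.1, "C": 1.0},
    "B": {"A": 0.1, "B": 2.0 + a, "C": 1.0},
    "C": {"A": 1.0, "B": 1.0, "C": 2.0},
}

for base, new in (("C", "A"), ("C", "B"), ("B", "C"), ("B", "A")):
    print(base, "->", new, round(excitation_cost(U, base, new), 6))
```

The configuration of C-players has two excitations of cost 8 per site, whereas the configuration of B-players has one of cost 8 and one of cost 15.2. This asymmetry is what favors the ensemble around the C configuration at low but fixed noise.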

For a = 0, both $X^B$ and $X^C$ have the same total payoff. $X^C$ has more lowest-cost fluctuations
and therefore it is low-noise ensemble stable. For a > 0, $X^C$ has a smaller total payoff but
nevertheless one can prove [16] that in the long run C is played with a frequency close to 1 if
the noise level is low but not extremely low.

Theorem 2 For every $\epsilon > 0$, there exist $a(\epsilon)$ and $T(\epsilon)$ such that for every $0 < a < a(\epsilon)$, there
exists T(a) such that for $T(a) < T < T(\epsilon)$, $X^C$ is $\epsilon$-ensemble stable, and for 0 < T < T(a),
$X^B$ is $\epsilon$-ensemble stable.

Observe that for a = 0, both $X^B$ and $X^C$ are stochastically stable (they appear with the
frequency 1/2 in the limit of zero noise) but $X^C$ is low-noise ensemble stable. For small a > 0,
$X^B$ is both stochastically stable (it appears with the frequency 1 in the limit of zero noise) and
low-noise ensemble stable. However, for intermediate noise $T(a) < T < T(\epsilon)$, if the number of
players is big enough, then in the long run, almost all players use the strategy C ($X^C$ is ensemble
stable). If we lower T below T(a), then almost all players start to use the strategy B. We may say
that at T = T(a) the society of players undergoes a phase transition from C-behavior to B-behavior.

Stochastic and ensemble stability of three-player spatial games were discussed in [34] and 
of some other spatial games in [35]. 

6 Stochastic and ensemble stability in adaptive dynamics

Here we will review two models of Darwinian adaptive dynamics. In both of them, the selection
part of the dynamics ensures that if the mean payoff of a given strategy at the time t is bigger
than the mean payoff of the other one, then the number of individuals playing the given strategy
should increase in t + 1. In the first model, introduced by Kandori, Mailath, and Rob [17], one
assumes (as in the standard replicator dynamics [2, 3]) that individuals receive average payoffs 
with respect to all possible opponents - they play against the average strategy. In the second 
model, introduced by Robson and Vega-Redondo [18, 19], at any moment of time, individuals 
play only one game with randomly chosen opponents. In both models, players may mutate 
with a small probability hence the population may move against a selection pressure. 

In the above models, at every discrete moment of time t, the state of the population is
described by the number of individuals, $z_t$, playing the strategy A in the two-player symmetric
game with the payoff matrix given in Example 1. Formally, by the state space we mean the
set $\Omega = \{z, 0 \leq z \leq n\}$. Now we will describe the dynamics of our system. It consists of two
components: selection and mutation. Let $\pi_i(z_t)$, i = A, B, denote the mean payoff of a strategy i
at the time t. In their paper [17], Kandori, Mailath, and Rob write

$$\pi_A(z_t) = \frac{a(z_t - 1) + b(n - z_t)}{n - 1}, \qquad \pi_B(z_t) = \frac{c z_t + d(n - z_t - 1)}{n - 1}, \tag{11}$$

provided $0 < z_t < n$.

It means that in every time step, players are paired infinitely many times to play the game or
equivalently, each player plays with every other player and his payoff is the sum of corresponding
payoffs.

The selection dynamics is formalized in the following way:

$$\begin{aligned}
z_{t+1} > z_t \quad &\text{if } \pi_A(z_t) > \pi_B(z_t), \\
z_{t+1} < z_t \quad &\text{if } \pi_A(z_t) < \pi_B(z_t), \\
z_{t+1} = z_t \quad &\text{if } \pi_A(z_t) = \pi_B(z_t), \\
z_{t+1} = z_t \quad &\text{if } z_t = 0 \text{ or } z_t = n.
\end{aligned} \tag{12}$$

Now mutations are added. At every moment t, each player switches to a new strategy
with some probability $\epsilon$. It is easy to see that for any two states of the population, there
is a positive probability of the transition between them in some finite number of time steps.
We have therefore obtained an irreducible Markov chain with n + 1 states. It has a unique
stationary state (a probability mass function) which we denote by $\mu_n^\epsilon$. The following theorem
was proved in [17].

Theorem 3 For any large enough n, $\lim_{\epsilon \to 0} \mu_n^\epsilon(0) = 1$ in the average strategy (K-M-R) model.

It means that in the long run, in the limit of no mutations, all players play the risk-dominant
strategy B, that is B is stochastically stable.

We will outline the proof of the above theorem (see [17, 19]). It is based on the tree
representation of stationary states of irreducible Markov chains [36, 37, 38] (see also the Appendix).

We will first show that z = 0 and z = n and possibly $nx^*$ (if it is an integer) are the only
absorbing states of the mutation-free dynamics (the selection part). Let $\Omega_A$ and $\Omega_B$ be the
basins of attraction of z = n and z = 0 respectively. The basins do not overlap and we have
that $\Omega_A = \{k, nx^* < k \leq n\}$ and $\Omega_B = \{k, 0 \leq k < nx^*\}$, where $x^* = (d - b)/(d - b + a - c)$ is
the solution of the equation $\pi_A = \pi_B$, that is $x^*$ is the mixed Nash equilibrium of the game. If
$nx^*$ is an integer, it is the additional invariant state of the dynamics.

Now to prove Theorem 3 it is enough to notice that $x^* > 1/2$ for the payoff matrix in
Example 1 (because B is risk-dominant, that is a + b < c + d) and therefore the population
needs more mutations to move from z = 0 to z = n than the other way around. Therefore
$q_m(z = 0)$ has a lower order in $\epsilon$ in (16) than $q_m(z = n)$,
so B is stochastically stable.
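
The mutation-counting step of this argument can be made concrete. The Python sketch below is a simplified illustration, not the tree construction of the proof: the function names, the population size n = 20, and the payoffs a = 5, b = 0, c = 4, d = 3 are our assumptions. It uses the mean payoffs (11) to find how many simultaneous mutations are needed to leave each homogeneous state for the basin of the other one.

```python
def kmr_payoffs(z, n, a, b, c, d):
    """Mean payoffs of A and B when z of n players use A; requires 0 < z < n."""
    pi_A = (a * (z - 1) + b * (n - z)) / (n - 1)
    pi_B = (c * z + d * (n - z - 1)) / (n - 1)
    return pi_A, pi_B


def mutations_needed(n, a, b, c, d):
    """Smallest numbers of mutants that reverse the direction of selection.

    Returns (leave_B, leave_A): mutants needed to escape z = 0 and z = n respectively.
    Assumes a coordination game, so that both numbers exist.
    """
    def a_favored(z):
        pi_A, pi_B = kmr_payoffs(z, n, a, b, c, d)
        return pi_A > pi_B

    def b_favored(z):
        pi_A, pi_B = kmr_payoffs(z, n, a, b, c, d)
        return pi_B > pi_A

    leave_B = next(m for m in range(1, n) if a_favored(m))
    leave_A = next(m for m in range(1, n) if b_favored(n - m))
    return leave_B, leave_A


n, a, b, c, d = 20, 5, 0, 4, 3
x_star = (d - b) / (d - b + a - c)
print("x* =", x_star)
print("mutations (leave z = 0, leave z = n):", mutations_needed(n, a, b, c, d))
```

Here x* = 0.75, and escaping the all-B state takes 16 mutations while escaping the all-A state takes only 5, so for small mutation probability the population spends most of its time in the state where everyone plays B.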

The general setup in the model of Robson and Vega-Redondo [18, 19] is the same. However, 
individuals are paired only once at every time step and play only one game before the selection 
process takes place. Let $p_t$ denote the random variable which describes the number of 
cross-pairings, i.e. the number of pairs of matched individuals playing different strategies at 
time $t$. Let us notice that $p_t$ depends on $z_t$. For a given realization of $p_t$ and $z_t$, mean payoffs 
obtained by each strategy are as follows: 

$$
\begin{aligned}
\tilde{\pi}_A(z_t, p_t) &= \frac{a(z_t - p_t) + b p_t}{z_t}, \\
\tilde{\pi}_B(z_t, p_t) &= \frac{c p_t + d(n - z_t - p_t)}{n - z_t},
\end{aligned}
\qquad (13)
$$

provided $0 < z_t < n$. 
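
To see how strongly the mean payoffs in (13) depend on the realization of $p_t$, one may evaluate them directly. The sketch below is only illustrative: the function name, the validity checks (an even number of players, so that everybody can be paired) and the payoff values are our assumptions.

```python
from fractions import Fraction

def mean_payoffs(a, b, c, d, n, z, p):
    """Mean payoffs of A and B in equation (13).

    z is the number of A-players and p the number of cross-pairings.
    Assumptions of this sketch: n is even, 0 < z < n, and p is feasible,
    i.e. 0 <= p <= min(z, n - z) and the z - p remaining A-players
    can be paired among themselves.
    """
    if n % 2 != 0 or not 0 < z < n:
        raise ValueError("need an even n and 0 < z < n")
    if p < 0 or p > min(z, n - z) or (z - p) % 2 != 0:
        raise ValueError("infeasible number of cross-pairings")
    pi_a = Fraction(a * (z - p) + b * p, z)
    pi_b = Fraction(c * p + d * (n - z - p), n - z)
    return pi_a, pi_b

# Hypothetical payoffs; the same state z = 4 with two different matchings.
no_cross = mean_payoffs(3, 0, 2, 2, n=10, z=4, p=0)   # A earns more
all_cross = mean_payoffs(3, 0, 2, 2, n=10, z=4, p=4)  # B earns more
print(no_cross, all_cross)
```

In this example the same state favors A under one matching and B under another, so which strategy grows is decided by the matching, not by $z_t$ alone.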

Let us denote by $\mu_n^{\epsilon}$ the stationary state of the corresponding Markov chain. It was proved 
in [18, 19] that the payoff-dominant strategy is stochastically stable. 

Theorem 4 For any large enough $n$, $\lim_{\epsilon \to 0} \mu_n^{\epsilon}(n) = 1$ in the random matching (R-V-R) model. 

Let us again outline the proof. Due to the stochastic nature of matching, basins of attraction 
overlap here. First of all, one can show that there exists $k$ such that if $n$ is large enough and 
$z_t \geq k$, then there is a positive probability (a certain realization of $p_t$) that after a finite number 
of steps of the mutation-free selection dynamics, all players will play A. Likewise, if $z_t < k$ (for 
any $k > 1$), then if the number of players is large enough, after a finite number of steps of 
the mutation-free selection dynamics all players will play B. In other words, $z = 0$ and $z = n$ 
are the only absorbing states of the mutation-free dynamics and there are no other recurrent 
classes. Moreover, if $n$ is large enough and $z_t \geq n - k$, then the mean payoff obtained by A 
is always (for any realization of $p_t$) bigger than the mean payoff obtained by B (in the worst 
case all B-players play with A-players). Therefore the size of the basin of attraction of the state 
$z = 0$ is at most $n - k - 1$ and that of $z = n$ is at least $n - k$. It follows that the system needs 
at least $k + 1$ mutations to evolve from $z = n$ to $z = 0$ and at most $k$ mutations to evolve from 
$z = 0$ to $z = n$. Now $q_m(z = n)$ is of lower order in $\epsilon$ in (16) than $q_m(z = 0)$, which finishes 
the proof. 

Stochastic stability is concerned with the limit of no mutations. We have shown [20] that for 
a low but fixed level of mutations in the sequential dynamics of the random matching model, 
if the number of players is large enough, then in the long run, the stochastically stable efficient 
strategy is played with the frequency close to zero. It is again the risk-dominant strategy, as 
in the Kandori-Mailath-Rob model, which is played with the frequency close to one. It means 
that when the number of players increases, the population undergoes a transition between its 
two equilibria. Of course, if the noise level is decreased for a new larger population, it settles 
again in the efficient equilibrium - after all the efficient equilibrium is stochastically stable. We 
have therefore proved in [20] that two limits of stationary distributions, the zero mutation and 
the infinite number of players limit are concentrated on two different equilibria. 

Let us describe our model in more detail. In sequential dynamics, in one time unit, only one 
player can change his strategy. The number of A-players in the population can increase by one 
in $t + 1$ if a B-player is chosen in $t$, which happens with the probability $(n - z_t)/n$. Analogously, 
the number of B-players in the population can increase by one in $t + 1$ if an A-player is chosen 
in $t$, which happens with the probability $z_t/n$. 

The player who has a revision opportunity will choose in $t + 1$ with the probability $1 - \epsilon$ the 
strategy with a higher average payoff in $t$ and the other one with the probability $\epsilon$. 
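
The sequential dynamics just described is easy to simulate. The sketch below is one possible reading of the rules, not the construction used in [20]. Several details are our assumptions and are marked as such in the comments: the number of players is even, the matching is sampled by shuffling the population, a tie in mean payoffs leaves the revising player with his current strategy, and when only one strategy is present it is treated as the better one. The payoff values are hypothetical.

```python
import random

def cross_pairings(n, z, rng):
    """Sample p_t by pairing players uniformly at random (assumes n is even)."""
    players = ["A"] * z + ["B"] * (n - z)
    rng.shuffle(players)
    # Consecutive entries (0,1), (2,3), ... form the matched pairs.
    return sum(players[i] != players[i + 1] for i in range(0, n, 2))

def better_strategy(n, z, p, payoffs):
    """Strategy with the higher mean payoff in (13), or None on a tie."""
    a, b, c, d = payoffs
    # Compare pi_A and pi_B after multiplying both by z (n - z) > 0.
    lhs = (a * (z - p) + b * p) * (n - z)
    rhs = (c * p + d * (n - z - p)) * z
    if lhs == rhs:
        return None
    return "A" if lhs > rhs else "B"

def step(n, z, eps, payoffs, rng):
    """One time unit of the sequential dynamics; returns z_{t+1}."""
    current = "A" if rng.random() < z / n else "B"   # strategy of the reviser
    if z == 0 or z == n:
        best = current        # assumption: only one strategy is present
    else:
        best = better_strategy(n, z, cross_pairings(n, z, rng), payoffs)
        if best is None:
            best = current    # assumption: a tie keeps the current strategy
    other = "B" if best == "A" else "A"
    choice = best if rng.random() < 1 - eps else other
    if choice == current:
        return z
    return z + 1 if choice == "A" else z - 1

def simulate(n, z0, eps, payoffs, steps, seed=0):
    """Trajectory z_0, z_1, ..., z_steps of the number of A-players."""
    rng = random.Random(seed)
    path = [z0]
    for _ in range(steps):
        path.append(step(n, path[-1], eps, payoffs, rng))
    return path

path = simulate(n=20, z0=10, eps=0.05, payoffs=(3, 0, 2, 2), steps=2000)
print(sum(path) / (20 * len(path)))   # empirical frequency of strategy A
```

Such a simulation can only suggest the behavior of the stationary state for given $n$ and $\epsilon$; it does not replace the estimates behind the theorem below.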

The following theorem was proven in [20]. 

Theorem 5 In the sequential dynamics of the random matching model, for any $\delta > 0$ and 
$\beta > 0$ there exist $\epsilon(\delta, \beta)$ and $n(\epsilon)$ such that for any $n > n(\epsilon)$ 

$$\mu_n^{\epsilon}(z \leq \beta n) > 1 - \delta.$$ 

Let us note that the above theorem concerns an ensemble of configurations, not an individual 
one. In the limit of the infinite number of players, that is the infinite number of configurations, 
every single configuration has zero probability in the stationary state. It is an ensemble of 
configurations that might be stable. 

We also showed in [20] that in the random matching model, stochastic stability itself may 
depend on the number of players. If the population consists of only one B-player and $n - 1$ 
A-players and if $c > [a(n - 2) + b]/(n - 1)$, that is $n < (2a - c - b)/(a - c)$, then $\tilde{\pi}_B > \tilde{\pi}_A$. It 
means that one needs only one mutation to evolve from $z = n$ to $z = 0$. It is easy to see that 
two mutations are necessary to evolve from $z = 0$ to $z = n$. Using again the tree representation 
of stationary states (see the Appendix), one can prove the following theorem. 

Theorem 6 If $n < (2a - c - b)/(a - c)$, then B is stochastically stable. 
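
The equivalence between the payoff condition for a single B-player and the bound on $n$ can be verified by direct computation. This is a small sketch under our own assumptions: the function names are ours, the payoffs are hypothetical integers with $a > c$, and only even $n$ are tested so that all players can be paired.

```python
from fractions import Fraction

def one_mutation_suffices(a, b, c, n):
    """True if a lone B-player earns more than the A-players.

    With z = n - 1 and p = 1, equation (13) gives pi_B = c and
    pi_A = [a (n - 2) + b] / (n - 1).
    """
    return c > Fraction(a * (n - 2) + b, n - 1)

def size_threshold(a, b, c):
    """The bound (2a - c - b) / (a - c) of Theorem 6; assumes a > c."""
    return Fraction(2 * a - c - b, a - c)

# Hypothetical payoffs: a = 5, b = 0, c = 4 give the threshold 6.
threshold = size_threshold(5, 0, 4)
small_populations = [n for n in range(2, 20, 2) if one_mutation_suffices(5, 0, 4, n)]
print(threshold, small_populations)   # 6 [2, 4]
```

For these payoffs, a population of 4 players satisfies the condition, while a population of 6 or more does not.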



We already know that for any large enough $n$, the efficient strategy A is stochastically stable. 
Again, the population changes its behavior when the number of players increases. However, 
the nature of this transition is different from the one described before. When the number 
of players increases, in order to see the stochastically stable efficient strategy, the mutation level 
should decrease substantially. 
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under the grant KBN 5 P03A 025 20 is kindly acknowledged. 

7 Appendix 

The following tree representation of stationary distributions of Markov chains was proposed 
by Freidlin and Wentzell in [36, 37] (see also [38]). Let $(\Omega, P)$ be an irreducible Markov chain 
with a state space $\Omega$ and transition probabilities given by $P : \Omega \times \Omega \to [0, 1]$. It has a unique 
stationary distribution $\mu$ (also called a stationary state). For $X \in \Omega$, let an $X$-tree be a directed 
graph on $\Omega$ such that from every $Y \neq X$ there is a unique path to $X$ and there are no edges 
going out of $X$. Denote by $T(X)$ the set of all $X$-trees and let 

$$q(X) = \sum_{d \in T(X)} \prod_{(Y, Y') \in d} P(Y, Y'), \qquad (14)$$

where the product is with respect to all edges of $d$. Now one can show [37] that 

$$\mu(X) = \frac{q(X)}{\sum_{Y \in \Omega} q(Y)} \qquad (15)$$

for all $X \in \Omega$. 
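
For a small chain, the tree representation (14)-(15) can be evaluated by brute force and compared with the stationarity condition $\mu P = \mu$. The following sketch enumerates all $X$-trees directly; the function names and the three-state example chain are our own illustration, and the enumeration is exponential in the number of states, so it is meant only for very small chains.

```python
from fractions import Fraction
from itertools import product

def is_x_tree(succ, root):
    """succ maps every state Y != root to the end of its unique outgoing edge.

    The graph is a root-tree iff following the edges from any Y reaches root,
    that is, iff no walk gets trapped in a cycle.
    """
    for start in succ:
        seen, y = set(), start
        while y != root:
            if y in seen:
                return False
            seen.add(y)
            y = succ[y]
    return True

def tree_weight(P, root):
    """q(root) from (14): sum over root-trees of the product of P over edges."""
    states = range(len(P))
    others = [y for y in states if y != root]
    total = Fraction(0)
    for targets in product(states, repeat=len(others)):
        succ = dict(zip(others, targets))
        if not is_x_tree(succ, root):
            continue
        weight = Fraction(1)
        for y, y_next in succ.items():
            weight *= P[y][y_next]
        total += weight
    return total

def stationary_from_trees(P):
    """Stationary distribution from (15)."""
    q = [tree_weight(P, x) for x in range(len(P))]
    norm = sum(q)
    return [qx / norm for qx in q]

F = Fraction
# A hypothetical irreducible chain on three states.
P = [[F(1, 2), F(1, 2), F(0)],
     [F(1, 4), F(1, 2), F(1, 4)],
     [F(0), F(1, 2), F(1, 2)]]
mu = stationary_from_trees(P)
print(mu)   # [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
```

Exact rational arithmetic is used so that the result can be compared with $\mu P = \mu$ without rounding errors.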

A state is an absorbing one if it attracts nearby states in the noise-free dynamics. We assume 
that after a finite number of steps of the noise-free dynamics we arrive at one of the absorbing 
states (there are no other recurrence classes) and stay there forever. Then it follows from the 
above tree representation that any state different from absorbing states has zero probability 
in the stationary distribution in the zero-noise limit. Moreover, in order to study the 
zero-noise limit of the stationary distribution, it is enough to consider paths between absorbing 
states. More precisely, we construct $X$-trees with absorbing states as vertices; the family of 
such $X$-trees is denoted by $\tilde{T}(X)$. Let 

$$q_m(X) = \max_{d \in \tilde{T}(X)} \prod_{(Y, Y') \in d} \tilde{P}(Y, Y'), \qquad (16)$$

where $\tilde{P}(Y, Y') = \max \prod_{(W, W')} P(W, W')$, where the product is taken along any path joining $Y$ 
with $Y'$ and the maximum is taken with respect to all such paths. Now we may observe that 
if $\lim_{\epsilon \to 0} q_m(Y)/q_m(X) = 0$ for any $Y \neq X$, then $X$ is stochastically stable. Therefore we have 
to compare trees with the biggest products in (16); such trees we call maximal. 
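
As an illustration, consider the case of two absorbing states, $z = 0$ and $z = n$, used in the proofs of Theorems 3 and 4. The following worked comparison is a sketch under one assumption that is ours: a path with $m$ mutations has probability of order $\epsilon^m$, all selection steps having probabilities of order one. The symbols $m_{0 \to n}$ and $m_{n \to 0}$, the minimal numbers of mutations needed to pass from one absorbing state to the basin of the other, are introduced here only for this purpose.

```latex
% Each tree in \tilde{T}(X) consists of a single edge joining the other
% absorbing state with X, so the maximum in (16) reduces to one factor:
\begin{align*}
q_m(z = n) &= \tilde{P}(0, n) \sim \epsilon^{\, m_{0 \to n}}, \\
q_m(z = 0) &= \tilde{P}(n, 0) \sim \epsilon^{\, m_{n \to 0}}, \\
\frac{q_m(z = n)}{q_m(z = 0)} &\sim \epsilon^{\, m_{0 \to n} - m_{n \to 0}}
  \xrightarrow[\epsilon \to 0]{} 0
  \quad \text{if } m_{0 \to n} > m_{n \to 0}.
\end{align*}
```

Hence $z = 0$ is stochastically stable when leaving it requires more mutations than leaving $z = n$, and $z = n$ is stochastically stable in the opposite case.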
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